LS-Join: Local Similarity Join on String Collections

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On Link-based Similarity Join

Graphs can be found in applications like social networks, bibliographic networks, and biological databases. Understanding the relationship, or links, among graph nodes enables applications such as link prediction, recommendation, and spam detection. In this paper, we propose link-based similarity join (LS-join), which extends the similarity join operator to link-based measures. Given two sets o...

متن کامل

Efficient string similarity join in multi-core and distributed systems

In big data area a significant challenge about string similarity join is to find all similar pairs more efficiently. In this paper, we propose a parallel processing framework for efficient string similarity join. First, the input is split into some disjoint small subsets according to the joint frequency distribution and the interval distribution of strings. Then the filter-verification strategy...

متن کامل

A Novel SSPS Framework for String Similarity Join

As the enormous growth of information challenges the existing string analysis techniques for processing huge volume of data, there always seem to be a hope for newer inventions. Moreover, the problems encountered with the traditional methods such as low pruning power, increased false positives and poor scalability should be addressed with the appropriate solutions that cater to the need for imp...

متن کامل

Streaming Similarity Self-Join

We introduce and study the problem of computing the similarity self-join in a streaming context (sssj), where the input is an unbounded stream of items arriving continuously. The goal is to find all pairs of items in the stream whose similarity is greater than a given threshold. The simplest formulation of the problem requires unbounded memory, and thus, it is intractable. To make the problem f...

متن کامل

Probabilistic Similarity Join on Uncertain Data

An important database primitive for commonly used feature databases is the similarity join. It combines two datasets based on some similarity predicate into one set such that the new set contains pairs of objects of the two original sets. In many different application areas, e.g. sensor databases, location based services or face recognition systems, distances between objects have to be computed...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: IEEE Transactions on Knowledge and Data Engineering

سال: 2017

ISSN: 1041-4347

DOI: 10.1109/tkde.2017.2687460